Better Approximation Algorithms for NMR Spectral Peak Assignment
Authors
Abstract
We study a constrained bipartite matching problem whose input is a weighted bipartite graph G = (U, V, E), where U is a set of vertices following a sequential order, V is another set of vertices partitioned into a collection of disjoint subsets, each following a sequential order, and E is a set of edges between U and V with non-negative weights. The objective is to find a maximum-weight matching in G that satisfies the given sequential orders on both U and V, i.e., if u_{i+1} follows u_i in U and v_{j+1} follows v_j in V, then u_i is matched with v_j if and only if u_{i+1} is matched with v_{j+1}. The problem has recently been formulated as a crucial step in an algorithmic approach for interpreting NMR spectral data [13], and interpreting NMR spectral data is a key problem in protein structure determination via NMR spectroscopy. Unfortunately, the constrained bipartite matching problem is NP-hard [13]. We first propose a 2-approximation algorithm for the problem, which follows directly from the recent result of Bar-Noy et al. [2] on interval scheduling. However, our extensive experiments on real NMR spectral data show that the algorithm performs poorly in terms of recovering the target-matching (i.e., correct) edges. We then propose another approximation algorithm that tries to take advantage of the "density" of the sequential order information in V. Although we are only able to prove an approximation ratio of 3 log_2 D for this algorithm, where D is the length of a longest string in V, the experimental results demonstrate that this new algorithm performs much better on real data, i.e., it is able to recover a large fraction of the target-matching edges.

* Supported in part by the Grant-in-Aid for Scientific Research of the Ministry of Education, Science, Sports and Culture of Japan, under Grant No. 12780241. Part of the work was done while visiting UC Riverside.
** Supported in part by a UCR startup grant and NSF Grants CCR-9988353 and ITR-0085910.
*** Supported in part by NSERC grants RGPIN249633 and A008599, and Startup Grant REE-P5-01-02-Sci from the University of Alberta.
† Supported by NSF Grant CCR-9988353.
‡ Supported by the Office of Biological and Environmental Research, U.S. Department of Energy, under Contract DE-AC05-00OR22725, managed by UT-Battelle, LLC.
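The sequential-order constraint in the abstract can be made concrete with a small sketch. It is not either of the paper's approximation algorithms: it is an exhaustive search, usable only on tiny instances, that shows what a feasible constrained matching looks like. It assumes that U is a single chain indexed 0..n-1 and that each ordered subset ("string") of V is either left unmatched or matched as a whole to consecutive positions of U. The names `placement_weight` and `best_constrained_matching`, and the example weights, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (position in U, vertex of V) -> non-negative edge weight; missing key = no edge.
Weights = Dict[Tuple[int, str], float]


def placement_weight(string: List[str], start: int, n_u: int,
                     weights: Weights) -> Optional[float]:
    """Total weight of matching `string` to U[start : start + len(string)],
    or None if the segment runs past U or a required edge is missing."""
    if start + len(string) > n_u:
        return None
    total = 0.0
    for offset, v in enumerate(string):
        w = weights.get((start + offset, v))
        if w is None:
            return None
        total += w
    return total


def best_constrained_matching(n_u: int, strings: List[List[str]],
                              weights: Weights) -> Tuple[float, Dict[int, int]]:
    """Exhaustive search (exponential time; illustration only).

    Returns (best weight, {string index: start position in U}).
    Each string is matched to consecutive U positions or not at all,
    which is how the "if and only if" order constraint is enforced here.
    """
    best: Tuple[float, Dict[int, int]] = (0.0, {})
    used = [False] * n_u
    chosen: Dict[int, int] = {}

    def search(k: int, total: float) -> None:
        nonlocal best
        if k == len(strings):
            if total > best[0]:
                best = (total, dict(chosen))
            return
        search(k + 1, total)  # option 1: leave string k unmatched
        s = strings[k]
        for start in range(n_u - len(s) + 1):  # option 2: try every segment
            span = range(start, start + len(s))
            if any(used[i] for i in span):
                continue
            w = placement_weight(s, start, n_u, weights)
            if w is None:
                continue
            for i in span:
                used[i] = True
            chosen[k] = start
            search(k + 1, total + w)
            del chosen[k]
            for i in span:
                used[i] = False

    search(0, 0.0)
    return best


# Hypothetical toy instance: |U| = 4, V split into strings (a, b), (c), (d).
example_weights: Weights = {
    (0, "a"): 3.0, (1, "b"): 3.0, (1, "a"): 1.0, (2, "b"): 1.0,
    (0, "c"): 5.0, (2, "c"): 2.0, (3, "d"): 4.0,
}
total, placement = best_constrained_matching(
    4, [["a", "b"], ["c"], ["d"]], example_weights)
```

In this toy instance the heaviest single edge (position 0 with c, weight 5) is not part of the best solution, because using it forces the string (a, b) onto a lighter segment.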
Similar Resources
Improved algorithms for 2-interval scheduling and NMR spectral peak assignment
We consider the 2-interval scheduling problem (2-ISP) defined as follows. We are given a discrete time interval I and a set J of jobs to be executed on a single machine during I. Each job v ∈ J requires either one or two contiguous time units of I and has a profit w(v, t) if v is started at time point t of I. Our goal is to maximize the total profit of the executed jobs. It has been recently sh...
RIBRA-An Error-Tolerant Algorithm for the NMR Backbone Assignment Problem
We develop an iterative relaxation algorithm called RIBRA for NMR protein backbone assignment. RIBRA applies nearest neighbor and weighted maximum independent set algorithms to solve the problem. To deal with noisy NMR spectral data, RIBRA is executed in an iterative fashion based on the quality of spectral peaks. We first produce spin system pairs using the spectral data without missing peaks,...
More Reliable Protein NMR Peak Assignment via Improved 2-Interval Scheduling
Protein NMR peak assignment refers to the process of assigning a group of "spin systems" obtained experimentally to a protein sequence of amino acids. The automation of this process is still an unsolved and challenging problem in NMR protein structure determination. Recently, protein NMR peak assignment has been formulated as an interval scheduling problem (ISP), where a protein sequence P of a...
Automated backbone assignment of labeled proteins using the threshold accepting algorithm.
The sequential assignment of backbone resonances is the first step in the structure determination of proteins by heteronuclear NMR. For larger proteins, an assignment strategy based on proton side-chain information is no longer suitable for the use in an automated procedure. Our program PASTA (Protein ASsignment by Threshold Accepting) is therefore designed to partially or fully automate the se...
NvAssign: protein NMR spectral assignment with NMRView
Motivation: Nuclear magnetic resonance (NMR) protein studies rely on the accurate assignment of resonances. The general procedure is to (1) pick peaks, (2) cluster data from various experiments or spectra, (3) assign peaks to the sequence and (4) verify the assignments with the spectra. Many algorithms already exist for automating the assignment process (step 3). What is lacking is a flexible in...